Abstract. Preference rankings appear in virtually all fields of science
(political sciences, behavioral sciences, machine learning, decision making
and so on). The well-known social choice problem consists in trying to find a
reasonable procedure to use the aggregate preferences expressed by subjects
(usually called judges) to reach a collective decision. This problem turns out
to be equivalent to the problem of estimating the consensus (central) ranking
from data, which is known to be an NP-hard problem. Emond and Mason in 2002
proposed a branch-and-bound algorithm to calculate the consensus ranking
given n rankings expressed on m objects. Depending on the complexity of the
problem, there can be multiple solutions, so the consensus ranking may not be
unique. We propose a new algorithm to find the consensus ranking that is
equivalent to Emond and Mason's algorithm in terms of at least one of the
solutions reached, but permits a remarkable saving in computational time.

Keywords: Preference rankings, Consensus ranking, Kemeny distance, Social
choice problem, Branch-and-bound algorithm


1 Introduction 


The consensus ranking problem, also known as the social choice problem, arises
any time n subjects (or judges) are asked to express their preferences on a
set of m objects. These objects are placed in order by each subject (where 1
represents the best and m the worst) without any attempt to describe how
much one differs from the others or whether any of the alternatives is good
or acceptable. Every independent observation is a permutation of m distinct
positive integers. More specifically, when the subject assigns the integer
values from 1 to m to all the m items we have a complete (or full) ranking.
When instead the judge fails to distinguish between two or more items and
assigns them the same integer (expressing indifference to the relative order
of this set of items), we deal with tied (or weak) rankings. Moreover, we
have a partial ranking when judges are asked to rank a subset of the entire
set of objects (e.g. pick the three favourite items out of a set of five)
[Marden(1996), D'Ambrosio and Heiser(2014)]. Rankings are by nature peculiar
data, in the sense that the sample space of m objects can only be visualized
in an (m − 1)-dimensional hyperplane by a discrete structure called the
permutation polytope, $S_m$. A polytope is the convex hull of a finite set of
points in $\mathbb{R}^m$ [Thompson(1993), Heiser(2004)]. For example, the
space of 4 objects with all possible ties is a truncated octahedron, which
can be visualized in Figure 1 [Heiser and D'Ambrosio(2013)]. As we already
pointed out, the permutation polytope is inscribed in an (m − 1)-dimensional
subspace; hence, for m > 4, such structures are impossible to visualize. The
permutation polytope is the natural space for ranking data. No data are
required to define it: it is completely determined by the number of items
involved in the preference choice, and data only add information on which
rankings occur and with what frequency. This space is discrete and finite. It
is characterized by symmetries and it is endowed with a graphical metric.

The problem of combining rankings to obtain a ranking representative of the
group has been studied for more than two centuries by numerous researchers in
several areas, e.g. voting systems, economics, machine learning, psychology
and political sciences. In the framework of distance-based models for
rankings, searching for the consensus ranking is a very important step in
modeling the ranking process [Marden(1996)]. These models are usually
exponential family models [Diaconis(1988)] and they are completely specified
by two parameters, a dispersion parameter and a consensus (central) ranking.
Maximum likelihood estimates of the dispersion parameter assume knowledge of
the central ranking. When the consensus ranking is not known it must be
estimated. Unfortunately, even if there are closed-form formulas for this
estimation, they are not feasible because of the complexity of the problem
[Critchlow(1985), Fligner and Verducci(1986), Fligner and Verducci(1988),
Diaconis(1988), Critchlow et al.(1991)].

Figure 1: Permutation polytope for all full and weak rankings of four
objects. For every ranking the corresponding ordering is shown.

Several methods to aggregate individual preference rankings have been
proposed since the works of [de Borda(1781)], [de Condorcet(1785)],
[Black(1958)], [Arrow(1951)], [Goodman and Markowitz(1952)], [Coombs(1964)],
[Davis et al.(1972)], [Bogart(1973)], [Cook and Seiford(1978)],
[Barthelemy and Monjardet(1981)], [Emond and Mason(2002)] and
[Meila et al.(2012)].

In this paper, we propose two heuristic algorithms called QUICK and FAST
to derive the consensus ranking from the aggregation of individual
preferences within the Kemeny and Snell axiomatic framework. Both algorithms
can be viewed as alternatives to the branch-and-bound (BB) algorithm by Emond
and Mason. The BB algorithm turns out to be a time-consuming procedure
when the number of objects is high and especially when the internal degree
of consensus present in the data is weak. Both the QUICK and FAST algorithms
can deal with complete and tied rankings as well as with incomplete (or
partial) rankings. The QUICK algorithm is the building block of the FAST
algorithm. Both provide savings in computational time, but the FAST
algorithm is more accurate because it finds more than one of the solutions
found by the BB algorithm, and it can also easily deal with problems
characterized by a large number of objects to be ranked, weak and partial
rankings and/or a low degree of internal consensus. On the other hand, the
QUICK algorithm turns out to be very useful when the number of objects is
limited, because it returns one of the solutions found by the BB algorithm,
or a very close solution, in a considerably smaller amount of time.

The rest of the paper is organized as follows. In Section 2 we briefly present
some of the proposed approaches to aggregate preference rankings and derive
a consensus. In Section 3 we describe the branch-and-bound algorithm by
Emond and Mason. Section 4 is devoted to describing the proposed algorithms.
In Sections 5 and 6 we present a simulation study and applications on
real data to evaluate both the accuracy and the efficiency of our proposal.
Concluding remarks are found in Section 7.


2 Finding the consensus ranking, some approaches

The term consensus ranking is a generic name for any ranking that summarizes
a set of individual rankings. There are two broad classes of approaches to
aggregating preference rankings in order to find a consensus [Cook(2006)]:

• ad hoc methods, which can be divided into elimination methods (for example
the American system method, the pairwise majority rule, etc.) and
non-elimination methods (for example Borda's method of marks (1781),
Condorcet's method (1785), etc.);

• distance-based models, which require defining a distance of the desired
consensus from the individual rankings.

A more detailed description of both these approaches can be found in
[Cook(2006)]. How to aggregate subjects' preferences to create a consensus is
a problem that goes back to 1781, when Borda formulated the method of marks
(also known as Borda's count) for determining the winner in elections with
more than 2 candidates. This method is quite simple and is based on
calculating the total rank for each alternative. For example, if we consider
the rankings in Table 1, the total rank for each alternative is given by:

Table 1: Example data to illustrate Borda's method of marks.

              Alternatives
# voters      A     B     C
   12         2     1     3
    5         1     2     3
    7         3     2     1


• A = 12×2 + 5×1 + 7×3 = 50,

• B = 12×1 + 5×2 + 7×2 = 36,

• C = 12×3 + 5×3 + 7×1 = 58,

resulting in the consensus (BAC). Borda's method of marks was criticized
by Condorcet, who proposed to use the majority rule on all the pairwise
comparisons between alternatives. Condorcet's solution for the rankings
reported in Table 1 can be obtained by calculating the support obtained by
every pairwise comparison between options, reported in Table 2. From Table 2
we can deduce that B ≻ A, B ≻ C and A ≻ C, resulting also in the consensus
ranking (BAC). In applying this method, unfortunately, one problem can be
encountered: if intransitive preferences occur, the simple majority
procedure breaks down (paradox of voting [Arrow(1951)], according to which
a set of transitive individual preferences can generate an intransitive
group preference).
Table 2: Support table to illustrate Condorcet's method on example data
(each cell reports the number of voters preferring the row alternative to
the column alternative).

        A     B     C
A       -     5    17
B      19     -    17
C       7     7     -
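As an illustration of the two ad hoc methods just described, the following minimal Python sketch recomputes the Borda totals and the pairwise support table from the data of Table 1. The variable names (`ranks`, `voters`, `support`) are ours and NumPy is assumed to be available; this is not code from the original study.

```python
import numpy as np

objects = ["A", "B", "C"]
# Each row is one group of voters; entries are the ranks given to A, B, C.
ranks = np.array([[2, 1, 3],
                  [1, 2, 3],
                  [3, 2, 1]])
voters = np.array([12, 5, 7])

# Borda's method of marks: total rank per alternative (lower is better).
borda_totals = voters @ ranks
borda_order = [objects[i] for i in np.argsort(borda_totals)]

# Condorcet's support table: support[i, j] is the number of voters
# ranking alternative i ahead of alternative j.
ahead = (ranks[:, :, None] < ranks[:, None, :]).astype(int)
support = np.tensordot(voters, ahead, axes=1)

print(borda_totals, borda_order)
print(support)
```

With these data both methods lead to the ordering B, A, C; the majority rule is only guaranteed to yield a ranking when the pairwise preferences are transitive.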


In the last century the rank aggregation problem has been approached from
a statistical perspective. [Kendall(1938)] was the first to propose a method
to aggregate input rankings to find a consensus. He studied the consensus
problem as a problem of estimation and proposed to rank items according
to the mean of the ranks assigned, a method equivalent to Borda's. Moreover,
he suggested considering the Spearman rank correlation coefficient ρ which,
given two preference rankings R and R* of m objects, is defined as:

$$\rho = 1 - \frac{6\, d^2(R, R^*)}{m^3 - m} \qquad (1)$$

where $d^2(R, R^*) = \sum_{i=1}^{m} (R_i - R_i^*)^2$ is the sum of the squared
differences between the ranks assigned by R and R* [Kendall (1948), page 8].
Spearman's ρ is equivalent to the product moment correlation coefficient and
treats rankings as if they were scores, summing the squares of the rank
differences.

[Kendall(1938)] proposed his own correlation coefficient, named after him
as Kendall's τ, by introducing the concept of ranking matrices. The ranking
matrix associated with a ranking R of m objects is an m × m matrix
$\{a_{ij}\}$ whose elements are defined as:

$$a_{ij} = \begin{cases}
 1 & \text{if object } i \text{ is ranked ahead of object } j \\
-1 & \text{if object } i \text{ is ranked behind object } j \\
 0 & \text{if the objects are tied, or if } i = j
\end{cases} \qquad (2)$$


The Kendall correlation coefficient τ between two rankings, R, with score
matrix $\{a_{ij}\}$, and R*, with score matrix $\{b_{ij}\}$, can then be
defined as the generalized correlation coefficient:

$$\tau(R, R^*) = \frac{\sum_{i=1}^{m}\sum_{j=1}^{m} a_{ij} b_{ij}}
{\sqrt{\sum_{i=1}^{m}\sum_{j=1}^{m} a_{ij}^2 \; \sum_{i=1}^{m}\sum_{j=1}^{m} b_{ij}^2}} \qquad (3)$$
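To make equations (2) and (3) concrete, here is a small Python sketch, assuming rankings are stored as rank vectors (entry i is the rank of object i, smaller meaning better, equal values meaning a tie). The function names `kendall_score_matrix` and `kendall_tau` are illustrative and are not taken from the paper.

```python
import numpy as np

def kendall_score_matrix(rank):
    """Score matrix of eq. (2): 1 if i is ahead of j, -1 if behind, 0 if tied or i == j."""
    r = np.asarray(rank)
    # i is ahead of j when its rank is smaller, i.e. r[j] - r[i] > 0.
    return np.sign(r[None, :] - r[:, None])

def kendall_tau(rank1, rank2):
    """Generalized correlation coefficient of eq. (3)."""
    a = kendall_score_matrix(rank1)
    b = kendall_score_matrix(rank2)
    return (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())

print(kendall_tau([1, 2, 3], [1, 2, 3]))   # identical rankings
print(kendall_tau([1, 2, 3], [3, 2, 1]))   # reversed rankings
print(kendall_tau([1, 1, 2], [1, 2, 3]))   # first ranking has a tie
```

Note that the coefficient is undefined for a ranking in which all objects are tied, because the corresponding score matrix is identically zero.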


In the same period [Kemeny(1959)] and [Kemeny and Snell(1962)] proposed
and proved an axiomatic approach to find a unique distance measure for
rankings and define a consensus ranking. They introduced four axioms,
reported in Table 3, that should apply to any distance measure between two
rankings. They also proved the existence and the uniqueness of a distance
metric that satisfies all these axioms, known as the Kemeny distance.

Table 3: Kemeny and Snell axioms

1. Axiom 1: $d(R_1, R_2)$ satisfies the three standard properties of a metric
(or distance):

(a) Positivity, $d(R_1, R_2) \geq 0$, with equality if and only if
$R_1 = R_2$.

(b) Symmetry, $d(R_1, R_2) = d(R_2, R_1)$.

(c) Triangular inequality, $d(R_1, R_3) \leq d(R_1, R_2) + d(R_2, R_3)$ for
any three rankings $R_1$, $R_2$, $R_3$, with equality holding if and only if
ranking $R_2$ is between $R_1$ and $R_3$.

2. Axiom 2: Invariance

$d(R_1, R_2) = d(R'_1, R'_2)$, where $R'_1$ and $R'_2$ result from $R_1$ and
$R_2$ respectively by the same permutation of the alternatives.

3. Axiom 3: Consistency in measurement

If two rankings $R_1$ and $R_2$ agree except for a set S of k elements, which
is a segment of both, then $d(R_1, R_2)$ may be computed as if these k
objects were the only objects being ranked.

4. Axiom 4: Scaling

The minimum positive distance is 1.

By using the score matrices as defined by Kendall, Kemeny's distance between
two rankings R (with score matrix $\{a_{ij}\}$) and R* (with score matrix
$\{b_{ij}\}$) is defined as:

$$d_{Kem}(R, R^*) = \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} |a_{ij} - b_{ij}| \qquad (4)$$
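The Kemeny distance of equation (4) can be sketched in a few lines of Python. As before, rankings are assumed to be stored as rank vectors (smaller rank means better, equal values mean a tie), and the helper names are ours rather than the authors'.

```python
import numpy as np

def kendall_score_matrix(rank):
    """Kendall score matrix: 1 if i is ahead of j, -1 if behind, 0 if tied or i == j."""
    r = np.asarray(rank)
    return np.sign(r[None, :] - r[:, None])

def kemeny_distance(rank1, rank2):
    """Half the sum of absolute differences between the two score matrices, eq. (4)."""
    a = kendall_score_matrix(rank1)
    b = kendall_score_matrix(rank2)
    return np.abs(a - b).sum() / 2

print(kemeny_distance([1, 2, 3], [1, 1, 2]))  # breaking one tie costs 1
print(kemeny_distance([1, 2, 3], [2, 1, 3]))  # swapping one adjacent pair costs 2
print(kemeny_distance([1, 2, 3], [3, 2, 1]))  # full reversal of 3 objects costs 6
```

The first example is consistent with Axiom 4: moving two adjacent objects into a tie is the smallest possible change and has distance 1.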

Kemeny and Snell then suggested using this distance function to define the
median ranking as a specific definition of consensus ranking. According to
their definition, the median ranking is the point in the ranking space that
shows the best agreement with the set of input rankings. More formally,
given a set of n independent input rankings $R_1, \ldots, R_n$, the median
ranking S is the point (or the points) for which $\sum_{i=1}^{n} d(R_i, S)$
is a minimum. Following the Kemeny and Snell approach, the search for the
median ranking requires exploring the space of all possible rankings of m
objects: the problem consists in finding the ranking S that best represents
the combined preferences of the judges. This is an NP-hard problem. When we
have m objects, there are m! possible complete rankings. When we deal with
tied rankings the analysis is more complex because, by including ties, the
number of possible rankings is approximately
$\frac{1}{2}\left(\frac{1}{\ln 2}\right)^{m+1} m!$ [Gross(1962)]. In other
words, the complexity of the search for the median ranking is entirely
determined by the number of objects to be ranked.
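To give a feeling for how quickly the search space grows, the sketch below counts the weak orderings (rankings with ties allowed) of m objects exactly, using the standard recurrence for ordered set partitions, and compares the count with the well-known asymptotic approximation $m!/(2(\ln 2)^{m+1})$. Function names are illustrative.

```python
from math import comb, factorial, log

def weak_orderings(m):
    """Exact number of rankings of m objects when ties are allowed.

    Recurrence: choose the k objects tied in first place, then order the rest.
    """
    counts = [1]
    for n in range(1, m + 1):
        counts.append(sum(comb(n, k) * counts[n - k] for k in range(1, n + 1)))
    return counts[m]

def approximation(m):
    """Asymptotic approximation m! / (2 * (ln 2)^(m + 1))."""
    return factorial(m) / (2 * log(2) ** (m + 1))

for m in (4, 9, 15, 20):
    print(m, factorial(m), weak_orderings(m), round(approximation(m)))
```

For m = 4 the exact count is 75, which is the number of vertices of the structure of all full and weak rankings of four objects.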


[Bogart(1973), Bogart(1975)] generalized the Kemeny and Snell approach by
considering both transitive and intransitive preferences.
[Cook and Saipe(1976)] proposed a branch-and-bound algorithm to determine the
median ranking out of a set of n independent preference rankings, deriving a
solution by adjacent pairwise optimal rankings. Emond and Mason (2002)
pointed out that Cook and Saipe's method does not guarantee that all
solutions are found and that in some examples local optima were encountered.
[Cook et al.(1997)] proposed a general representation of distance-based
consensus with the aim of associating a value to rank positions and developed
models for deriving a consensus. [Cook et al.(2007)] presented a
branch-and-bound algorithm for finding the consensus ranking in the presence
of partial rankings, but not allowing for ties. [Emond and Mason(2002)]
proposed a new rank correlation coefficient called $\tau_x$ that is
equivalent to the Kemeny and Snell distance metric. They defined the score
matrices slightly differently from Kendall's representation: $a_{ij} = 1$ if
object i is either ranked ahead of or tied with object j, $a_{ij} = -1$ if
object i is ranked behind object j, and $a_{ij} = 0$ only if i = j. Using
these score matrices, they defined their rank correlation coefficient as:

$$\tau_x(R, R^*) = \frac{\sum_{i=1}^{m}\sum_{j=1}^{m} a_{ij} b_{ij}}{m(m-1)} \qquad (5)$$
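The following sketch implements the Emond and Mason score matrix and $\tau_x$, and checks numerically the known linear relation $\tau_x = 1 - 2\,d_{Kem}/(m(m-1))$ on a small example with a tie. Rankings are assumed to be rank vectors (smaller means better); all function names are ours.

```python
import numpy as np

def em_score_matrix(rank):
    """Emond-Mason scores: 1 if i is ahead of or tied with j, -1 if behind, 0 on the diagonal."""
    r = np.asarray(rank)
    s = np.where(r[:, None] <= r[None, :], 1, -1)
    np.fill_diagonal(s, 0)
    return s

def tau_x(rank1, rank2):
    """Rank correlation coefficient of eq. (5)."""
    m = len(rank1)
    return (em_score_matrix(rank1) * em_score_matrix(rank2)).sum() / (m * (m - 1))

def kemeny_distance(rank1, rank2):
    """Kemeny distance of eq. (4), based on Kendall's score matrices."""
    a = np.sign(np.asarray(rank1)[None, :] - np.asarray(rank1)[:, None])
    b = np.sign(np.asarray(rank2)[None, :] - np.asarray(rank2)[:, None])
    return np.abs(a - b).sum() / 2

r1, r2 = [1, 2, 3], [1, 1, 2]
m = len(r1)
print(tau_x(r1, r2), 1 - 2 * kemeny_distance(r1, r2) / (m * (m - 1)))
print(tau_x(r2, r2))  # a tied ranking is perfectly correlated with itself
```

The last line illustrates the motivation for the modified score matrix: with this definition a ranking containing ties still has correlation 1 with itself.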


Note that $\tau_x$ is equivalent to Kendall's τ when ties are not allowed. By
using this correlation coefficient they proposed a branch-and-bound algorithm
that solves the median ranking problem in a reasonable computing time when
the number of objects m is at most 20. Given n weak orderings of m objects,
$R_1, \ldots, R_n$, where each ordering carries a positive weight $w_k$, the
median ranking S is the one (or the ones) that maximizes the weighted
average correlation with the n input rankings or, equivalently, the one (or
the ones) that minimizes the weighted average Kemeny distance to the n
input rankings.


$$\max_{S} \; \frac{\sum_{k=1}^{n} w_k\, \tau_x(S, R_k)}{\sum_{k=1}^{n} w_k} \qquad (6)$$


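Denoting by $\{s_{ij}\}$ and $\{a_{ij}^{(k)}\}$ the score matrices of S and
of the kth ordering $R_k$, k = 1, ..., n, the problem is:

$$\max \sum_{k=1}^{n} w_k \left( \sum_{i=1}^{m}\sum_{j=1}^{m} s_{ij}\, a_{ij}^{(k)} \right)
= \max \sum_{i=1}^{m}\sum_{j=1}^{m} s_{ij}\, c_{ij} \qquad (7)$$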


where $c_{ij} = \sum_{k=1}^{n} w_k\, a_{ij}^{(k)}$. The score matrix
$\{c_{ij}\}$ was called by Emond and Mason the Combined Input Matrix (CI)
because it is the result of a weighted summation of the score matrices of
the input rankings. Defined in this way, it summarizes the information
contained in the rankings in a single matrix.

Emond and Mason conceived a branch-and-bound algorithm to maximize
equation (7) by defining an upper limit on the value of that dot product.
This limit, considering that the score matrix $\{s_{ij}\}$ consists only of
the values 1, 0 and −1, is given by the sum of the absolute values of the
elements of CI:

$$V = \sum_{i=1}^{m}\sum_{j=1}^{m} |c_{ij}|$$
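As a worked example, the sketch below builds the combined input matrix for the three groups of voters of Table 1 (used here purely for illustration), then evaluates the objective of equation (7) and the upper limit V for the candidate ranking (BAC). Rankings are assumed to be rank vectors and the function names are ours.

```python
import numpy as np

def em_score_matrix(rank):
    """Emond-Mason scores: 1 if i is ahead of or tied with j, -1 if behind, 0 on the diagonal."""
    r = np.asarray(rank)
    s = np.where(r[:, None] <= r[None, :], 1, -1)
    np.fill_diagonal(s, 0)
    return s

def combined_input_matrix(rankings, weights):
    """Weighted sum of the score matrices of the input rankings."""
    return sum(w * em_score_matrix(r) for r, w in zip(rankings, weights))

rankings = [[2, 1, 3], [1, 2, 3], [3, 2, 1]]   # ranks of A, B, C
weights = [12, 5, 7]                            # number of voters per ranking

ci = combined_input_matrix(rankings, weights)
upper_limit = np.abs(ci).sum()                            # V
objective = (ci * em_score_matrix([2, 1, 3])).sum()       # eq. (7) for (BAC)

print(ci)
print(upper_limit, objective)
```

Here the objective attains the upper limit, which happens when the signs of the combined input matrix themselves form a valid score matrix.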


3 Emond and Mason's branch-and-bound algorithm

If a weak ordering of m objects is given as initial solution, it is possible
to compute the associated score matrix $\{s_{ij}\}$ and evaluate the value of
expression (7). Then it is possible to define an initial penalty P by
subtracting this value from V. The problem is to search the set of all weak
orderings of m objects to find those with the minimum penalty. This set can
be divided into three mutually exclusive branches based on the relative
position of the first two objects in the ordering represented in the initial
solution, labeled as i and j. An incremental penalty for each of the
branches can be calculated by considering the corresponding elements
$c_{ij}$ and $c_{ji}$ of the CI matrix, as specified in Table 4. If the
incremental penalty for a branch is greater than the initial penalty, then
we do not consider it any longer, because all orderings in that branch will
have a penalty larger than the initial one.

Table 4: Penalty computation in the BB algorithm

Let δP be the incremental penalty:

• object i is preferred to object j (Branch 1):

if $c_{ij} > 0$ and $c_{ji} < 0$, then δP = 0

if $c_{ij} > 0$ and $c_{ji} > 0$, then δP = $c_{ji}$

if $c_{ij} < 0$ and $c_{ji} > 0$, then δP = $c_{ji} - c_{ij}$

• object i is tied with object j (Branch 2):

if $c_{ij} > 0$ and $c_{ji} < 0$, then δP = $-c_{ji}$

if $c_{ij} > 0$ and $c_{ji} > 0$, then δP = 0

if $c_{ij} < 0$ and $c_{ji} > 0$, then δP = $-c_{ij}$

• object j is preferred to object i (Branch 3):

if $c_{ij} > 0$ and $c_{ji} < 0$, then δP = $c_{ij} - c_{ji}$

if $c_{ij} > 0$ and $c_{ji} > 0$, then δP = $c_{ij}$

if $c_{ij} < 0$ and $c_{ji} > 0$, then δP = 0
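The case analysis of Table 4 can be written as a small lookup function. The sketch below transcribes the nine cases literally; the function name `incremental_penalty` and the choice to raise an error for sign patterns that the table does not list (for example when one of the two cells is zero) are our assumptions, not part of the original algorithm description.

```python
def incremental_penalty(c_ij, c_ji, branch):
    """Incremental penalty of Table 4.

    branch 1: i preferred to j; branch 2: i tied with j; branch 3: j preferred to i.
    """
    if c_ij > 0 and c_ji < 0:
        case = 0
    elif c_ij > 0 and c_ji > 0:
        case = 1
    elif c_ij < 0 and c_ji > 0:
        case = 2
    else:
        # Assumption: patterns not listed in Table 4 are rejected here.
        raise ValueError("sign pattern not covered by Table 4")
    table = {
        1: (0, c_ji, c_ji - c_ij),
        2: (-c_ji, 0, -c_ij),
        3: (c_ij - c_ji, c_ij, 0),
    }
    return table[branch][case]

# Pair (A, B) of the Table 1 example: c_AB = -14, c_BA = 14.
for branch in (1, 2, 3):
    print(branch, incremental_penalty(-14, 14, branch))
```

For this pair only Branch 3 (B preferred to A) adds no penalty, so a branch-and-bound search with a small initial penalty would discard the other two branches early.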

If the incremental penalty of a branch is smaller than (or equal to) the
initial penalty, we then consider the next object in the initial solution and
create new branches by placing this object in all possible positions relative
to the objects already considered.

The algorithm continues iteratively by including all other objects until all
branches to be considered are checked. The BB algorithm works with complete,
incomplete and partial rankings. It deals with incomplete rankings thanks to
the convention that unranked objects do not add anything in forming the
combined input matrix. Emond and Mason stated that the computation time
needed to reach a solution (or solutions) depends both on the inherent degree
of consensus in the sample of judges and on the quality of the initial
solution used to initialize the algorithm. For an extensive discussion of
the branch-and-bound algorithm we refer to [Emond and Mason(2000),
Emond and Mason(2002)].
4 FAST and QUICK algorithms

The first element to be evaluated in developing our algorithm is the combined
input matrix. This matrix contains all the information about the rankings
expressed by all the subjects and, if it is a valid score matrix, the
median ranking can be found immediately. Unfortunately such a situation
rarely happens. However, by examining the CI more closely it is possible
to identify a good candidate for the median ranking that can be used as
an input to the algorithm. Let Q = 1 be a vector of ones of size m, and let
$\{c_{ij}\}$ be the m × m combined input matrix. By taking into account all
the pairwise combinations of the m objects, each pair of items is evaluated
once by considering the two associated cells in CI. A moderately accurate
first candidate for the median ranking can be computed as follows:

If sign $c_{ij}$ = 1 & sign $c_{ji}$ = −1, then $Q_i = Q_i + 1$;

If sign $c_{ij}$ = −1 & sign $c_{ji}$ = 1, then $Q_j = Q_j + 1$;

If sign $c_{ij}$ = 1 & sign $c_{ji}$ = 1, then $Q_i = Q_i + 1$, $Q_j = Q_j + 1$.

In this way, we obtain the updated rank vector Q containing the number of
times each object is preferred to the others in the pairwise comparisons. This
vector is the starting point of our algorithm. The first step is to compute
the score matrix, $\{q_{ij}\}$, associated with Q. Then we compute the
associated penalty as:

$$P = V - \sum_{i,j} c_{ij}\, q_{ij} \qquad (8)$$
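The construction of the starting vector Q and of the penalty of equation (8) can be sketched as follows, using the combined input matrix of the Table 1 example. Two points are our assumptions rather than statements from the text: the counts in Q are converted into a rank vector by reversing them (so that the most preferred object gets rank 1), and the score matrix is the Emond and Mason one. All function names are illustrative.

```python
import numpy as np
from itertools import combinations

def initial_counts(ci):
    """Vector Q: starts at one and is incremented by the three sign rules, once per pair."""
    m = ci.shape[0]
    q = np.ones(m, dtype=int)
    for i, j in combinations(range(m), 2):
        s_ij, s_ji = np.sign(ci[i, j]), np.sign(ci[j, i])
        if s_ij == 1 and s_ji == -1:
            q[i] += 1
        elif s_ij == -1 and s_ji == 1:
            q[j] += 1
        elif s_ij == 1 and s_ji == 1:
            q[i] += 1
            q[j] += 1
    return q

def em_score_matrix(rank):
    """Emond-Mason scores: 1 if i is ahead of or tied with j, -1 if behind, 0 on the diagonal."""
    r = np.asarray(rank)
    s = np.where(r[:, None] <= r[None, :], 1, -1)
    np.fill_diagonal(s, 0)
    return s

def penalty(ci, rank):
    """P = V - sum_ij c_ij q_ij, eq. (8)."""
    return np.abs(ci).sum() - (ci * em_score_matrix(rank)).sum()

ci = np.array([[0, -14, 10], [14, 0, 10], [-10, -10, 0]])  # objects A, B, C
q = initial_counts(ci)
start = ci.shape[0] + 1 - q   # ASSUMPTION: reverse the counts so that rank 1 is the best
print(q, start, penalty(ci, start))
```

In this small example the starting candidate already has zero penalty, which corresponds to the case in which the median ranking can be read directly from the combined input matrix.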

After this step we take into account the object ranked in second position in
Q, and we evaluate equation (8) by placing that object in all possible
positions relative to the object ranked ahead, including ties. In other words,
in the first step the second ranked object is placed ahead of, in a tie with
and behind the first object, keeping all other objects fixed in their initial
positions. The penalty (equation (8)) is then evaluated for these three
rankings, and the candidate median ranking is updated by selecting the
ranking associated with the minimum penalty; only this ranking is carried
over to the next step. Subsequently we add the object ranked in the third
position in the initial Q vector, and again we compute the values of
equation (8) by placing that object in all possible positions relative to the
objects already ranked ahead, including all possible ties. As before, we
update the candidate median ranking by selecting the one that minimizes the
penalty. We continue in this way until all the objects are processed and we
reach a possible solution.

We then use the obtained solution as the starting point for a new complete
loop. The overall procedure is repeated by also considering the reverse
ranking of the initial Q vector as candidate median ranking. The complete
algorithm is summarized in Box 1. Note that when we evaluate the penalty, we
consider all the objects in the ranking that is considered as candidate
solution. This is a fundamental difference from the original algorithm,
because Emond and Mason calculate the penalty values only by considering
the elements of the combined input matrix associated with the processed
objects, and update the penalty by adding up these partial values. We never
use this penalty update.

Box 1 QUICK algorithm for the median ranking problem

input: $\{c_{ij}\}$, Q

initialize: fix the rank of the first ranked object in Q

(1.) consider the next ranked object in Q

(2.) evaluate eq. (8) for all the rankings obtained by placing that object
in all possible positions with respect to the fixed ranked objects

(3.) store only the ranking associated with the minimum value of eq. (8)

(4.) fix the rank of the processed object and return to step (1.) until all
objects in Q are processed

Obtain the updated ranking CR, and repeat all previous steps by replacing
Q with CR

output: CR = median ranking.
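The following Python sketch illustrates the spirit of the procedure in Box 1; it is a simplified reading, not the authors' implementation. In particular, it is our assumption that each object is moved to every possible position (strictly between, or tied with) relative to all the other objects, that the full penalty of equation (8) is recomputed for every candidate, and that the loop is repeated until no move lowers the penalty. Rankings are rank vectors and all names are illustrative.

```python
import numpy as np

def em_score_matrix(rank):
    """Emond-Mason scores: 1 if i is ahead of or tied with j, -1 if behind, 0 on the diagonal."""
    r = np.asarray(rank)
    s = np.where(r[:, None] <= r[None, :], 1, -1)
    np.fill_diagonal(s, 0)
    return s

def penalty(ci, rank):
    """Full penalty of eq. (8), computed on all objects of the candidate ranking."""
    return np.abs(ci).sum() - (ci * em_score_matrix(rank)).sum()

def dense(rank):
    """Relabel ranks as consecutive integers 1, 2, ... preserving order and ties."""
    _, inverse = np.unique(rank, return_inverse=True)
    return inverse + 1

def moves(rank, obj):
    """All rankings obtained by moving obj to every position relative to the other objects."""
    base = 2 * dense(np.delete(rank, obj))      # other objects sit on even levels
    levels = base.max() // 2
    for slot in range(1, 2 * levels + 2):       # odd slot: between levels, even slot: tie
        yield dense(np.insert(base, obj, slot))

def quick(ci, start):
    """Greedy search: keep the best placement of each object, loop until no improvement."""
    best = dense(np.asarray(start))
    best_pen = penalty(ci, best)
    improved = True
    while improved:
        improved = False
        order = np.argsort(best, kind="stable")  # objects from first to last ranked
        for obj in order[1:]:                    # the first ranked object stays fixed
            for cand in moves(best, obj):
                p = penalty(ci, cand)
                if p < best_pen:
                    best, best_pen, improved = cand, p, True
    return best, best_pen

ci = np.array([[0, -14, 10], [14, 0, 10], [-10, -10, 0]])  # Table 1 example, objects A, B, C
rank, pen = quick(ci, [3, 2, 1])                            # deliberately poor start (CBA)
print(rank, pen)
```

Started from the reversed ordering (CBA), this sketch reaches the ranking (BAC) with zero penalty on the Table 1 example. Being a greedy search, it returns a single ranking and offers no guarantee of optimality in general.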

We call this algorithm "QUICK" because it is able to reach at least one
solution, or a solution very close to the true one, in a few seconds even
when working with a huge number of objects. In our experience, by using our
definition of the starting point Q, at least one solution is found.
Sometimes, however, solutions were also reached with random starting points.
For this reason, we decided to use the QUICK algorithm as the building block
of our accurate FAST algorithm for the median ranking problem, whose
pseudo-code is shown in Box 2. Our FAST algorithm is useful when the problem
is otherwise intractable, e.g. when the number of objects to be ranked is
high and the internal degree of consensus is low. Among the solutions
returned by the QUICK algorithm, the median rankings are those showing the
highest value of the average $\tau_x$ rank correlation coefficient.

Box 2 Accurate FAST algorithm

input: $\{c_{ij}\}$
for iter = 1:maxiter do
    if iter = 1 then
        CR = QUICK(Q, $\{c_{ij}\}$), with Q as defined before
        store CR
    else
        Q = random permutation of m objects
        CR = QUICK(Q, $\{c_{ij}\}$)
        store CR

output: CR = CR : $\tau_x$ = max
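The multi-start logic of Box 2 can be sketched independently of the local search it wraps. In the sketch below, `local_search` is a hypothetical callable standing in for QUICK: it receives a starting rank vector and returns a pair (ranking, penalty). Selecting the rankings with the lowest penalty is used here in place of the highest average $\tau_x$, since a lower value of equation (8) corresponds to a higher value of the objective in equation (7). The names `fast`, `first_start` and `maxiter` are ours.

```python
import random

def fast(local_search, first_start, m, maxiter=10, seed=0):
    """Run local_search from first_start, then from random permutations; keep the best results."""
    rng = random.Random(seed)
    found = {}
    for it in range(maxiter):
        if it == 0:
            start = list(first_start)
        else:
            start = rng.sample(range(1, m + 1), m)   # random permutation of the ranks 1..m
        ranking, pen = local_search(start)
        found[tuple(ranking)] = pen                  # store every distinct solution
    best = min(found.values())
    return sorted(r for r, p in found.items() if p == best), best

# Toy stand-in for QUICK (hypothetical): it returns the start unchanged and
# uses the rank given to the first object as a fake penalty.
def toy_search(start):
    return tuple(start), start[0]

solutions, best = fast(toy_search, [1, 2, 3], m=3, maxiter=20)
print(solutions, best)
```

Because all distinct solutions with the best value are returned, a multi-start scheme of this kind can recover several of the tied optima that a single run would miss.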

5 Simulation study 

We implemented the BB algorithm by Emond and Mason, as well as both
the QUICK and FAST algorithms, in the MatLab and R environments. The
reported results are based on code written in the MatLab language. A beta
version of the R ConsRank package is available upon request from the authors,
as is the MatLab code. Analyses were run on a computer with an Intel Core
i5-3317U 1.70 GHz processor and 4 GB of RAM.

To evaluate the performance of our algorithms in terms of accuracy and
efficiency, we performed a simulation study. Ranking data were simulated
according to a distance-based model by selecting three different levels of the
dispersion parameter θ, which governs the degree of consensus in the sample
of rankings. In the distance-based models framework, for a given consensus
S, a distance function d, and some real parameter θ, the density with respect
to the uniform distribution is equal to

$$f_\theta(a; S) = C(\theta)\exp(-\theta\, d(S, a)),$$

where a is a ranking and $C(\theta)$ is a normalizing constant. For more
details on distance-based models we refer to [Marden(1996)],
[Feigin and Cohen(1978)] and [Critchlow et al.(1991)].
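For a small number of objects, the distance-based model above can be simulated by brute force: enumerate the ranking space, weight each ranking by exp(−θ d(S, a)) and normalize. The sketch below does this for complete rankings only, with the Kemeny distance; it is an illustration under these assumptions and not the sampling code used in the study. Function names and the seed argument are ours.

```python
import itertools
import numpy as np

def kemeny_distance(rank1, rank2):
    """Kemeny distance between two rank vectors (Kendall score matrices)."""
    a = np.sign(np.asarray(rank1)[None, :] - np.asarray(rank1)[:, None])
    b = np.sign(np.asarray(rank2)[None, :] - np.asarray(rank2)[:, None])
    return np.abs(a - b).sum() / 2

def model_probabilities(center, theta):
    """Probabilities of all complete rankings under the distance-based model."""
    m = len(center)
    space = [np.array(p) for p in itertools.permutations(range(1, m + 1))]
    dist = np.array([kemeny_distance(center, r) for r in space])
    weights = np.exp(-theta * dist)
    return space, weights / weights.sum()   # dividing by the sum plays the role of C(theta)

def sample_rankings(center, theta, size, seed=0):
    """Draw `size` complete rankings around `center` with dispersion `theta`."""
    space, prob = model_probabilities(center, theta)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(space), size=size, p=prob)
    return np.array([space[i] for i in idx])

space, prob = model_probabilities([1, 2, 3, 4], theta=0.7)
sample = sample_rankings([1, 2, 3, 4], theta=0.7, size=200)
print(prob.max(), sample.shape)
```

Setting θ = 0 gives the uniform distribution over the ranking space, while larger values of θ concentrate the probability mass around the consensus S. Full enumeration is only practical for small m, since the space has m! elements.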

The three levels chosen for θ were 0.7, 0.4 and 0.1; the distance used was
the Kemeny distance. We considered 4 different levels for m: 4, 9, 15 and
20. In the case of 4 and 9 objects, we repeated the experiment considering
both complete rankings only and the full space of complete and tied rankings,
while in the case of 15 and 20 objects we limited the experiment to complete
rankings sampled from a limited sub-population of size 10 million. These
sub-populations were generated from the full ranking space of 10 objects by
adding the remaining objects in such a way that they were at first ranked
below, later ranked ahead, and then randomly ranked in a middle position.
Sample size was always equal to 200. Another experiment involved incomplete
rankings. We chose a scheme of the type "pick k out of m", and precisely:
pick 2 out of 4, pick 5 out of 9 and pick 10 out of 15. Rankings were sampled
in this way: first we extracted a random number of rankings (from a minimum
of 15 to a maximum of 30) from the corresponding spaces according to the
uniform distribution, by setting θ = 0; then we generated the weights from a
normal distribution with means randomly generated between 10 and 30 and
standard deviations randomly generated between 2.5 and 9. Afterwards we
normalized the weights and multiplied them by the total sample size to obtain
data sets of size approximately 200. Each experiment was repeated ten times,
for a total of 240 data sets. Table 5 summarizes the experimental design. For
each data set we ran the BB, the QUICK and the FAST algorithms. We recorded
the median rankings found by the three algorithms as well as the elapsed time
in seconds to reach the solutions. We used the BB algorithm as benchmark to
check the accuracy of our algorithms in terms of solutions. The initial
solution for all the algorithms was the updated rank vector Q as defined in
Section 4.

Table 6 shows in the first column a summary of the solutions reached by the
BB algorithm. The second and third columns show, respectively, summary
measures of the solutions returned by the QUICK and FAST algorithms that
coincide with those returned by Emond and Mason's algorithm. Both the QUICK
and FAST algorithms always found at least one solution, and the proportion of
solutions found by the FAST algorithm was always higher than (or equal to)
the one returned by QUICK. There were no relevant differences among the
factors of the experimental design except, as expected, that the lower θ, the
higher the number of solutions identified. This is due to the fact that in
this particular experiment there was a moderate internal degree of consensus
present in the data, even when θ was set equal to 0.1. Table 7 reports the
solutions returned by the BB algorithm and the number of coincident solutions
recovered by the QUICK and FAST algorithms in the experiment with incomplete
rankings. In this case, due to the sampling procedure, the internal degree of
consensus in the data sets was quite poor. The experiments with 9 and 15
objects count a maximum number of solutions equal to 31 and 7761,
respectively. In one case the QUICK algorithm did not find any of the BB
solutions, but this did not happen with the FAST algorithm. This particular
case is helpful to understand why we called this algorithm "FAST". The BB
algorithm found 25 solutions in 24240.054 seconds (about 6.733 hours), each
one with an average $\tau_x$ equal to 0.106. The FAST algorithm found 6 of
the 25 solutions in 64.932 seconds. The two solutions found by the QUICK
algorithm were found in 0.693 seconds and were very close to the true
solutions, since they were characterized by an average $\tau_x$ equal to
0.104. This was the only case in which the QUICK algorithm did not find any
of the BB solutions. Figure 2 and the following figures show the distribution
of working time of both the BB and QUICK algorithms. We do not show the
box-plots relative to the FAST algorithm because its computing time was
approximately equal to the number of iterations multiplied by the computing
time of the QUICK algorithm. As can be noted, the QUICK algorithm is on
average faster than the BB algorithm, and the variability of the computing
time increases as the value of θ decreases.

Table 8 summarizes the computing time for the experiment involving
incomplete rankings. The computation time of the QUICK algorithm does not
show considerable variability while, especially in the case of 15 objects,
the BB computational time shows a higher variability.

Table 5: Experimental factors by levels

Objects              Rankings      θ \ Distribution
4                    Full          0.7, 0.4, 0.1
4                    Tied          0.7, 0.4, 0.1
9                    Full          0.7, 0.4, 0.1
9                    Tied          0.7, 0.4, 0.1
15                   Full          0.7, 0.4, 0.1
20                   Full          0.7, 0.4, 0.1
pick 2 out of 4      Incomplete    Normal, Uniform
pick 5 out of 9      Incomplete    Normal, Uniform
pick 10 out of 15    Incomplete    Normal, Uniform

Table 6: Summary measures of the number of solutions reached by the BB
algorithm and of the number of coincident solutions found by QUICK and FAST,
by number of objects; experiment with complete and tied rankings.

                        BB solutions    QUICK    FAST
4 objects    Mean            1.2         1.1      1.2
             Median          1.0         1.0      1.0
             Minimum         1.0         1.0      1.0
             Maximum         3.0         2.0      3.0
9 objects    Mean            1.2         1.1      1.2
             Median          1.0         1.0      1.0
             Minimum         1.0         1.0      1.0
             Maximum         9.0         2.0      5.0
15 objects   Mean            2.6         1.4      1.9
             Median          1.0         1.0      1.0
             Minimum         1.0         1.0      1.0
             Maximum        18.0         3.0      6.0
20 objects   Mean            2.6         1.2      1.9
             Median          1.0         1.0      1.0
             Minimum         1.0         1.0      1.0
             Maximum         9.0         2.0      5.0



Table 7: Summary measures of the number of solutions reached by the BB
algorithm and of the number of coincident solutions found by QUICK and FAST;
experiment with incomplete rankings.

                          BB solutions    QUICK     FAST
2 out of 4     Mean            1.5         1.3       1.5
               Median          1.0         1.0       1.0
               Minimum         1.0         1.0       1.0
               Maximum         3.0         2.0       3.0
5 out of 9     Mean            7.4         2.1       3.7
               Median          4.0         2.0       2.5
               Minimum         1.0         1.0       1.0
               Maximum        31.0         4.0      12.0
10 out of 15   Mean          451.0         1.6      13.1
               Median          8.0         1.5       4.0
               Minimum         1.0         0.0       1.0
               Maximum      7761.0         3.0     102.0


Table 8: Summary measures of elapsed times (in seconds) for finding the
solutions

                               BB        QUICK      FAST
2 out of 4     Mean           0.031      0.012      0.337
               Median         0.012      0.010      0.318
               Minimum        0.009      0.008      0.261
               Maximum        0.097      0.027      0.595
5 out of 9     Mean           0.282      0.170     14.328
               Median         0.287      0.185     16.278
               Minimum        0.218      0.063      7.788
               Maximum        0.378      0.219     16.398
10 out of 15   Mean        1967.438      0.745     65.910
               Median       255.663      0.686     66.103
               Minimum        0.745      0.660     64.413
               Maximum    24240.054      1.343     68.537
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4 items 



Figure 2: Working time in second. The first row of box-plots refers to com¬ 
plete rankings, the second row refers to tied and complete rankings 

6 Real data applications 

The first real data application concerns the data reported by Emond and
Mason (2000, p. 28), which are shown in Table 9. The first 15 columns
represent the objects to be ranked, with labels in the first row, while the last
column reports the weight associated with each ranking. By using the BB
algorithm we obtained exactly the following solutions (as also reported by
Emond and Mason, 2000, p. 29), with an average τ_x equal to 0.166:

1. <D L (E-M) (A-B) I P (C-N) H F G (O-Q)>

2. <D L (E-M) (A-B-P) (C-N) I H F G (O-Q)>

3. <D L (E-M) (B-P) A (C-N) I H F G (O-Q)>


Figure 3: Working time in seconds (9 items; panels for θ = 0.7, 0.4 and
0.1). The first row of box-plots refers to complete rankings, the second row
refers to tied and complete rankings.


Computing time was equal to 5113.608 seconds. We ran the QUICK algorithm
on these data, obtaining solution number 3 in a computing time of
0.155 seconds. Then we ran our FAST algorithm with 100 starting points,
obtaining all three solutions with a computing time of 12.627 seconds.
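
To make the bracket notation and the coefficient τ_x concrete, the following minimal Python sketch converts the three orderings above into rank vectors and computes τ_x between each pair. It assumes the score-matrix form of τ_x from Emond and Mason (2002), restated here because it is not given in this section: a_ij = 1 if object i is ranked ahead of or tied with object j, a_ij = -1 if it is ranked behind, a_ii = 0, and τ_x is the sum of the products a_ij b_ij divided by m(m-1). The names parse_ordering and tau_x are illustrative only and are not taken from the authors' code; partial rankings are not handled.

```python
from itertools import combinations

OBJECTS = "ABCDEFGHILMNOPQ"  # the 15 object labels used in Table 9


def parse_ordering(text):
    """Turn '<D L (E-M) ...>' into a rank vector indexed like OBJECTS.

    Objects inside one pair of parentheses are tied and share a position.
    """
    ranks = {}
    for position, bucket in enumerate(text.strip("<>").split(), start=1):
        for label in bucket.strip("()").split("-"):
            ranks[label] = position
    return [ranks[label] for label in OBJECTS]


def tau_x(r, s):
    """tau_x between two rank vectors (ties allowed, no missing objects).

    Assumed definition: score +1 if i is ahead of or tied with j, else -1.
    """
    m = len(r)
    total = 0
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            a = 1 if r[i] <= r[j] else -1
            b = 1 if s[i] <= s[j] else -1
            total += a * b
    return total / (m * (m - 1))


solutions = [
    "<D L (E-M) (A-B) I P (C-N) H F G (O-Q)>",
    "<D L (E-M) (A-B-P) (C-N) I H F G (O-Q)>",
    "<D L (E-M) (B-P) A (C-N) I H F G (O-Q)>",
]
vectors = [parse_ordering(s) for s in solutions]

# Pairwise agreement between the three median rankings.
for (i, r), (j, s) in combinations(enumerate(vectors, start=1), 2):
    print(f"tau_x(solution {i}, solution {j}) = {tau_x(r, s):.3f}")
```

Only the relative order of the positions matters for τ_x, so the bucket index can be used as the rank without adjusting for the size of the tied groups.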

The second data set used to compare the computing time of the algorithms is
the well-known data set on the voters in the 1980 election of the American
Psychological Association president (Diaconis, 1988; Murphy and Martin,
2003). This data set contains the rankings expressed by 15,449 psychologists
on five candidates: A = Bevan, B = Iscoe, C = Kiesler, D = Siegle and
E = Wright. Of these rankings only 5,738 are complete, while the remaining
ones are partial rankings. As shown in Table 10, all the algorithms reached
the same unique solution, characterized by an average τ_x equal to 0.023.


Figure 4: Working time in seconds (panels for θ = 0.7, 0.4 and 0.1). The
first row of box-plots refers to complete rankings on 15 objects, the second
row refers to complete rankings on 20 objects.

The third data set is known as the Sports data set and comes from
Louis Roussos (Marden, 1996). In these data, 130 students of the University
of Illinois were asked to rank seven sports according to their preference for
participating in them. The sports considered were: A = baseball, B = football,
C = basketball, D = tennis, E = cycling, F = swimming and G = jogging.
In this case too, all the algorithms reach the same unique solution,
characterized by an average τ_x of 0.428, as reported in Table 11.


Table 9: Emond and Mason’s data

   A   B   C   D   E   F   G   H   I   L   M   N   O   P   Q  Wk
   1   6   4   5   -   1   2   7   3   1   5   2   6   5   5   4
  11  10   4   8   9   1   7  12   2   3   2   6  13   5  14   4
  11  12   3  11   7   1   4   5  12   2   6  10  11   8   9   4
   2   4   3   3  11   8  10   9   6  10   5   1   5   7   5   5
   2   8   4   8   7   1   2   5   2   3   6   7   8   -   -   4
   2   9   5   1   4   3   2   7   3   1   8   6   3   4   8   5
   3   9   7   1   2   8  13   6   1  10   5  11   9   4  14   5
   4   2   9   1   3  12   6  10  13  14  11   9   7   8   5   5
   4   3   5  11  12  10  13   7   6   8   2   1   9   9  11   7
   4   7   8   6  13   2   3  12   9   1   5  10   5  11  11   4
   6   1   3   3   6   2   6   5   4   5   1   1   2   1   1   5
   6  10  14   5   7   1   8   3   2   3   4  11  13  12   9   4
   6   6   8   1   1   3   5   1  10   7   2  10   9   4   6   7
   7   2   -   1   2  10   5   3   9   8   6   7   7   6   4   5
   7   4   6   1   5  14  10  12  15   3  13   9   8   2  11   5
   7   8   4   5   7   1   6   5   3   2   7   9  10  11  12   4
   8   4   7   2   1  11   4   6   3  12   6  10  13   5   9   7
   9   8   7   6   3   4   -   2   5   1   3   7   6   4   6   7
   -   -   3   1   1   5   5   4   5   2   4   2   6   7   8   7
   -   -   4   7   2  10  11   5   8   8   9   1   2   3   6   7
   -   -   5   6  12   9  10   8   2  11   1   4   7   2   3   7


Table 10: Median ranking on APA data set

Algorithm   solution        elapsed time   replications
BB          <C A E D B>            1.033
QUICK       <C A E D B>            0.764
FAST        <C A E D B>           27.814             50


Table 11: Median ranking on Sports data set

Algorithm   solution            elapsed time   replications
BB          <E F C A D B G>            0.076
QUICK       <E F C A D B G>            0.084
FAST        <E F C A D B G>            3.592             50


To test the ability of our algorithms to deal with rankings with a large
number of objects, the fourth data set is a random subset of the rankings
collected by O’Leary Morgan and Morgon (2010) on the 50 American States.
The number of items (the number of American States) is equal to 50, and the
number of rankings is equal to 104. These data concern rankings of the 50
American States on three particular aspects: socio-demographic
characteristics (such as population in 2008, GDP per capita, median
household income, total expenditures, etc.), health care expenditures (such
as per capita hospital expenditures, % of people covered by health
insurance, % of people covered by employment-based insurance, etc.) and
crime statistics (such as crime rate, number of arrests, murder rate, etc.).
It was unfeasible to run Emond and Mason’s algorithm on these data. The
orderings corresponding to the three solutions found by the FAST algorithm,
characterized by an average τ_x equal to 0.298, are reported in Table 12.
These solutions were obtained in 1177.274 seconds (about 19.6 minutes) with
1000 iterations. The QUICK algorithm found one solution (solution 2 in
Table 12) in 16.384 seconds.


Table 12: Median ranking found by FAST algorithm, American states data

        solution 1   solution 2   solution 3
 1      CA           CA           CA
 2      NY           NY           NY
 3      FL           FL           FL
 4      MD           MD           MD
 5      LA           LA           LA
 6      NM           NM           NM
 7      DE           TX           DE
 8      TX           IL           TX
 9      IL           DE           IL
10      PA           PA           PA
11      MI           MI           MI
12      GA           GA           GA
13      NC           NC           NC
14      NJ           NJ           NJ
15      MA           MA           MA
16      WA           WA           WA
17      OH           OH           OH
18      VA           VA           VA
19      TN           TN           TN
20      NV           NV           NV
21      AZ           AZ           AZ
22      MO           MO           MO
23      IN           IN           IN
24      AK           AK           AK
25      WI           WI           WI
26      CO           CO           CO
27      CT           CT           CT
28      MN           MN           MN
29      AL           AL           AL
30      SC           SC           SC
31      OR           OR           OR
32      OK           OK           OK
33      MS           MS           KY
34      AR           AR           MS
35      HI           HI           AR
36      KY           KY           HI
37      (KS - RI)    (KS - RI)    (KS - RI)
39      UT           UT           UT
40      (IA - NE)    (IA - NE)    (IA - NE)
42      WY           WY           WY
43      WV           WV           WV
44      ID           ID           ID
45      ME           ME           ME
46      MT           MT           MT
47      NH           NH           NH
48      SD           SD           SD
49      VT           VT           VT
50      ND           ND           ND
τ_x     0.298        0.298        0.298


7 Concluding remarks

In this paper we proposed two accurate algorithms (QUICK and FAST) to
solve the problem of identifying the median ranking in situations involving
full, weak and partial rankings. Our approach lies within the Kemeny and
Snell theoretical framework. Our algorithms can be considered as an
alternative to the branch-and-bound algorithm proposed by Emond and Mason
(2002). The BB algorithm turns out to be a time-demanding procedure when
the number of objects is high, especially when the degree of internal
consensus in the data is weak. Our approach is heuristic and, thus, it does
not return all the possible solutions that can be found by an exhaustive
search (as in the BB algorithm). For this reason it may happen that QUICK
does not reach a solution because it is stuck in a local optimum. However,
even when this happens, FAST, by repeatedly running the QUICK algorithm
with random permutations of the m items, always identified a global
optimum in our experiments. Nevertheless, finding all the solutions in the
presence of multiple median rankings may not always be the final goal of
the analysis, especially considering that the returned solutions are
mutually coherent, since they present the same value of the average τ_x.

We illustrated the performance of both algorithms in terms of accuracy and
computational efficiency via simulated and real data sets. As shown by the
results of the simulation studies, when the number of objects is smaller
than 15, the FAST algorithm on average recovers all the solutions returned
by the BB algorithm. On the other hand, when the number of objects is equal
to or higher than 15, FAST recovers on average 70% of the BB solutions. The
QUICK algorithm always finds at least one of the solutions in considerably
less time than the BB algorithm. When dealing with partial rankings and a
weak internal degree of consensus, the FAST algorithm again shows a good
performance. Indeed, even if it does not return all the BB solutions, it
always returns more than one solution in a limited amount of time. In this
case, QUICK also finds at least one BB solution in a considerably shorter
time. Moreover, as can be noted from the real data analysis, when the
number of objects is smaller than 20, QUICK again always finds one of the
BB solutions in a shorter period of time than the BB algorithm. If the
number of objects is greater than 20, as in the case of the 50 American
States data set, Emond and Mason’s algorithm is unfeasible, while FAST
finds three solutions in less than 20 minutes.

To some extent, the impact of our proposal can be compared to that obtained
by Mola and Siciliano (1997) in the field of classification and regression
trees (Breiman et al., 1984). As an example, Siciliano and Mola (2000),
D’Ambrosio et al. (2007) and D’Ambrosio et al. (2012) considered the FAST
algorithm to speed up the splitting procedure in tree growing, which proved
to be effective, respectively, in dealing with huge and complex data sets,
in reducing the computational cost of using ensemble methods and, finally,
in accelerating missing data imputation within the statistical learning
paradigm.
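
The multi-start strategy summarized in these remarks (repeatedly running a local procedure from random permutations of the m items and keeping every distinct ordering that attains the best value found) can be sketched in Python as follows. This is only an illustration under stated assumptions: the local step is a plain adjacent-swap hill climb chosen for brevity, not the authors' QUICK algorithm; the input is restricted to full rankings without ties; and the cost is the total number of pairwise disagreements with the judges, which for such rankings orders candidates in the same way as the total Kemeny distance. All function names are hypothetical.

```python
import random


def pairwise_against(rankings):
    """against[i][j] = number of judges ranking object j strictly ahead of i."""
    m = len(rankings[0])
    return [[sum(r[j] < r[i] for r in rankings) for j in range(m)]
            for i in range(m)]


def cost(order, against):
    """Total pairwise disagreements between an ordering and all judges."""
    m = len(order)
    return sum(against[order[p]][order[q]]
               for p in range(m) for q in range(p + 1, m))


def hill_climb(order, against):
    """Stand-in local search: swap neighbours while a swap lowers the cost."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for p in range(len(order) - 1):
            a, b = order[p], order[p + 1]
            if against[b][a] < against[a][b]:  # placing b before a is cheaper
                order[p], order[p + 1] = b, a
                improved = True
    return order


def multi_start(rankings, n_starts=50, seed=0):
    """Run the local search from random permutations; keep all best orderings."""
    rng = random.Random(seed)
    against = pairwise_against(rankings)
    m = len(rankings[0])
    best_cost, best = None, set()
    for _ in range(n_starts):
        start = list(range(m))
        rng.shuffle(start)
        order = hill_climb(start, against)
        c = cost(order, against)
        if best_cost is None or c < best_cost:
            best_cost, best = c, {tuple(order)}
        elif c == best_cost:
            best.add(tuple(order))
    return best_cost, best


# Three judges forming a preference cycle over objects 0, 1, 2
# (each row gives the rank assigned to each object).
cycle = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
value, orderings = multi_start(cycle)
print(value, sorted(orderings))
```

A single run of the local search returns one ordering only, whereas collecting the results over many random starts is what allows several equally good orderings to be reported, which mirrors the relation between QUICK and FAST described above.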


Acknowledgments 

Work by Sonia Amodio was supported by Progetto Innosystem, POR Campania FSE 
2007/2013, CUP B25B09000070009. 

References 

[Arrow(1951)] Arrow, K. J., 1951. Social choice and individual values.
(Cowles Commission Monogr. No. 12.). Wiley.

[Barthelemy and Monjardet(1981)] Barthelemy, P. J., Monjardet, B., 1981.
The median procedure in cluster analysis and social choice theory.
Mathematical Social Sciences 1 (3), 235-267.

[Black(1958)] Black, D., 1958. The theory of committees and elections.



[Bogart(1973)] Bogart, K. P., 1973. Preference structures I: Distances
between transitive preference relations. Journal of Mathematical Sociology
3 (1), 49-67.

[Bogart(1975)] Bogart, K. P., 1975. Preference structures II: Distances
between asymmetric relations. SIAM Journal on Applied Mathematics
29 (2), 254-262.

[Breiman et al.(1984)] Breiman, L., Friedman, J., Olshen, R. A., Stone,
C. J., 1984. Classification and regression trees. CRC Press.

[Cook(2006)] Cook, W. D., 2006. Distance-based and ad hoc consensus models
in ordinal preference ranking. European Journal of Operational Research
172 (2), 369-385.

[Cook et al.(2007)] Cook, W. D., Golany, B., Penn, M., Raviv, T., 2007.
Creating a consensus ranking of proposals from reviewers’ partial ordinal
rankings. Computers & Operations Research 34 (4), 954-965.

[Cook et al.(1997)] Cook, W. D., Kress, M., Seiford, L. M., 1997. A general
framework for distance-based consensus in ordinal ranking models.
European Journal of Operational Research 96 (2), 392-397.

[Cook and Saipe(1976)] Cook, W. D., Saipe, A., 1976. Committee approach
to priority planning: the median ranking method. Cahiers du Centre
d’Etudes de Recherche Operationnelle 18 (3), 337-351.

[Cook and Seiford(1978)] Cook, W. D., Seiford, L. M., 1978. Priority ranking
and consensus formation. Management Science 24 (16), 1721-1732.

[Coombs(1964)] Coombs, C. H., 1964. A theory of data. Wiley.

[Critchlow et al.(1991)] Critchlow, D. E., Fligner, M. A., Verducci, J. S.,
1991. Probability models on rankings. Journal of Mathematical Psychology
35 (3), 294-318.

[Critchlow(1985)] Critchlow, Douglas E, P. R., 1985. Lecture Notes in
Statistics. Springer.

[D’Ambrosio et al.(2007)] D’Ambrosio, A., Aria, M., Siciliano, R., 2007.
Robust tree-based incremental imputation method for data fusion. In:
Advances in Intelligent Data Analysis VII. Springer, pp. 174-183.





[D’Ambrosio et al.(2012)] D’Ambrosio, A., Aria, M., Siciliano, R., 2012.
Accurate tree-based missing data imputation and data fusion within the
statistical learning paradigm. Journal of Classification 29 (2), 227-258.

[D’Ambrosio and Heiser(2014)] D’Ambrosio, A., Heiser, W. J., 2014. A
recursive partitioning method for the prediction of preference rankings
based upon Kemeny distances. Technical Report University of Napoli
Federico II, Submitted.

[Davis et al.(1972)] Davis, O. A., DeGroot, M. H., Hinich, M. J., 1972. Social
preference orderings and majority rule. Econometrica: Journal of the
Econometric Society, 147-157.

[de Borda(1781)] de Borda, J. C., 1781. Mémoire sur les élections au scrutin.
Histoire de l’Académie Royale des Sciences.

[de Condorcet(1785)] de Condorcet, M., 1785. Essai sur l’application de
l’analyse à la probabilité des décisions rendues à la pluralité des voix
(Essay on the Application of Analysis to the Probability of Majority
Decisions).

[Diaconis(1988)] Diaconis, P., 1988. Group representations in probability and
statistics. Lecture Notes-Monograph Series, i-192.

[Emond and Mason(2000)] Emond, E., Mason, D., 2000. A new technique
for high level decision support. Department of National Defence,
Operational Research Division.

[Emond and Mason(2002)] Emond, E., Mason, D., 2002. A new rank
correlation coefficient with application to the consensus ranking problem.
Journal of Multi-Criteria Decision Analysis 11 (1), 17-28.

[Feigin and Cohen(1978)] Feigin, P. D., Cohen, A., 1978. On a model for
concordance between judges. Journal of the Royal Statistical Society.
Series B (Methodological), 203-213.

[Fligner and Verducci(1986)] Fligner, M. A., Verducci, J. S., 1986. Distance
based ranking models. Journal of the Royal Statistical Society. Series B
(Methodological), 359-369.





[Fligner and Verducci(1988)] Fligner, M. A., Verducci, J. S., 1988.
Multistage ranking models. Journal of the American Statistical Association
83 (403), 892-901.

[Goodman and Markowitz(1952)] Goodman, L. A., Markowitz, H., 1952.
Social welfare functions based on individual rankings. American Journal
of Sociology, 257-262.

[Gross(1962)] Gross, O. A., 1962. Preferential arrangements. American
Mathematical Monthly, 4-8.

[Heiser(2004)] Heiser, W. J., 2004. Geometric representation of association
between categories. Psychometrika 69 (4), 513-545.

[Heiser and D’Ambrosio(2013)] Heiser, W. J., D’Ambrosio, A., 2013.
Clustering and prediction of rankings within a Kemeny distance framework.
In: Algorithms from and for Nature and Life. Springer, pp. 19-31.

[Kemeny(1959)] Kemeny, J. G., 1959. Mathematics without numbers.
Daedalus 88 (4), 577-591.

[Kemeny and Snell(1962)] Kemeny, J. G., Snell, J. L., 1962. Mathematical
models in the social sciences. Vol. 9. Ginn Boston.

[Kendall(1938)] Kendall, M. G., 1938. A new measure of rank correlation.
Biometrika, 81-93.

[Kendall(1948)] Kendall, M. G., 1948. Rank correlation methods.

[Marden(1996)] Marden, J. I., 1996. Analyzing and modeling rank data. CRC
Press.

[Meila et al.(2012)] Meila, M., Phadnis, K., Patterson, A., Bilmes, J. A.,
2012. Consensus ranking under the exponential model. arXiv preprint
arXiv:1206.5265.

[Mola and Siciliano(1997)] Mola, F., Siciliano, R., 1997. A fast splitting
procedure for classification trees. Statistics and Computing 7 (3), 209-216.

[Murphy and Martin(2003)] Murphy, T. B., Martin, D., 2003. Mixtures of
distance-based models for ranking data. Computational Statistics & Data
Analysis 41 (3), 645-655.




[O’Leary Morgan and Morgon(2010)] O’Leary Morgan, K., Morgon, S., 
2010. State Rankings 2010: A Statistical view of America; Crime State 
Ranking 2010: Crime Across America; Health Care State Rankings 
2010: Health Care Across America. CQ Press. 

[Siciliano and Mola(2000)] Siciliano, R., Mola, F., 2000. Multivariate data
analysis and modeling through classification and regression trees.
Computational Statistics & Data Analysis 32 (3), 285-301.

[Thompson(1993)] Thompson, G., 1993. Generalized permutation polytopes 
and exploratory graphical methods for ranked data. The Annals of 
Statistics, 1401-1430. 



